Abstract: Image data has become a vital source of
information with the rapid growth of Information
Technology. The World Wide Web has facilitated easy,
round-the-clock access to data, and high-capacity storage
devices and communication links have made it possible to
archive image data in large volumes. Time and efficiency are
the most important factors in recognizing information from
these datasets. The many information databases hold diverse
categories of image data. Image classification can group the
images into a limited number of major categories based on
their contents. The authors propose two novel feature
extraction techniques in this work and compare their
classification results with those of existing feature
extraction techniques. The proposed techniques exhibited
higher performance than the state-of-the-art techniques and
contributed principally to improved classification
performance.

1. Introduction

Image capturing devices have undergone a paradigm shift in
terms of innovation and intelligence. This has given rise to a
huge and steadily growing volume of image data. Various
applications, including military services, criminology,
entertainment and education, treat image data as a rich
source of information. Increased efficiency is imperative for
effective utilization of the rich information hidden in image
datasets [1]. Efficient use of image data requires ready access
and prompt retrieval. A supervised and systematized database
with a limited number of major classes [2] can be a useful tool
for searching images in an image database more efficiently,
because classifying the images inside the database speeds up
the search. The authors propose two novel feature extraction
techniques in this paper and compare their classification
performance with that of state-of-the-art feature extraction
techniques. The proposed techniques outperformed the existing
techniques and improved classification performance.

2. Related Work 

Describing image data in terms of features captures the
inherent properties of the images [21]. Feature extraction is
required for classifying images in heterogeneous collections of
image data. The term "feature" in this context refers to the
colour, shape or texture of an image. Averaging and histogram
techniques have been used to capture the colour aspect of an
image [3,4,5]. Texture can be obtained by using transforms
[6,7] or vector quantization [8,9]. Shape aspects have been
obtained with gradient operators or morphological operators
[10]. Earlier approaches have studied K-means clustering using
Block Truncation Coding (BTC) and colour moments to classify
images into various categories [11]. Mean threshold based
techniques and global threshold based techniques have
predominantly been used for feature extraction by image
binarization [12,13,14,15,16]. For unevenly illuminated or
stained images, feature extraction for image categorization has
been carried out with local threshold selection for
binarization [17,18,19,20]. The proposed methods surpassed the
classification results obtained with existing feature
extraction techniques and achieved a higher classification
rate.

3. Block Truncation Coding 

Block Truncation Coding is a simple compression algorithm
that first segments the image into n x n (typically 4 x 4)
non-overlapping blocks [23, 24]. The algorithm was developed
in 1979, at an early stage of image processing. It was
developed for grayscale images and later extended to colour
images. The algorithm codes the blocks one at a time. Each
reconstructed block consists of new values calculated from the
mean and standard deviation of the block, so that the mean and
standard deviation of the reconstructed block remain the same
as those of the original block. The proposed work extracted
each color component Red (R), Green (G) and Blue (B) as a
block, as in Fig. 1, and applied the concept of block
truncation coding to it.
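
The moment-preserving property mentioned above can be illustrated with a short Python sketch of classic two-level BTC on a single grayscale block. This is a minimal illustration under stated assumptions, not the implementation used in this work: the block is passed as a flat list of pixel values, the population standard deviation is used, and the function name `btc_block` is hypothetical.

```python
import math


def btc_block(block):
    """Encode and reconstruct one grayscale block with two-level BTC.

    `block` is a flat list of pixel values (for example the 16 pixels
    of a 4 x 4 block). Returns the reconstructed block as a list.
    """
    m = len(block)
    mean = sum(block) / m
    # Population standard deviation of the block.
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    # Bitmap: True where the pixel exceeds the block mean.
    bitmap = [p > mean for p in block]
    q = sum(bitmap)  # number of pixels above the mean
    if q == 0:
        # Flat block: no pixel exceeds the mean, so all equal the mean.
        return [mean] * m
    # The two output levels are chosen so that the reconstructed block
    # keeps the mean and standard deviation of the original block.
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if bit else low for bit in bitmap]
```

The reconstructed block contains only two distinct values, yet its mean and standard deviation match those of the input block.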



Fig. 1 Red, Green and Blue Component as blocks

4. Proposed Method

Two different methods of threshold selection are proposed
in the following subsections.

4.1. Static Thepade's Ternary BTC (STTBTC)

The proposed method followed Block Truncation Coding (BTC)
by dividing the image into Red (R), Green (G) and Blue (B)
components and treating each component as a block. First, a
threshold value was calculated for each color component as in
equation 1. The overall luminance threshold was then
calculated from the individual threshold values of the color
components as in equation 2. Next, the lower and upper
threshold intervals of each color component were calculated as
in equations 3 and 4. This calculation used a degree value n,
which was varied from 1 to 5 to obtain different lower and
upper threshold values.



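$$T_x = \frac{1}{m \cdot n} \sum_{i=1}^{m} \sum_{j=1}^{n} x(i,j) \tag{1}$$

$$T_{overall} = \frac{T_R + T_G + T_B}{3} \tag{2}$$

$$T_{xlo} = T_x - n\,|T_x - T_{overall}| \tag{3}$$

$$T_{xhi} = T_x + n\,|T_x - T_{overall}| \tag{4}$$

where n = 1, 2, 3, 4 and 5 in equations 3 and 4 (in equation 1,
m x n is the size of the color component block), and x stands
for each of the color components Red (R), Green (G) and
Blue (B).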
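
The threshold calculation described in this subsection can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes that the threshold of a color component is the mean pixel value of that component and that the overall threshold is the average of the three component thresholds; the function names `channel_mean` and `static_thresholds` are hypothetical.

```python
def channel_mean(channel):
    """Mean pixel value of one color component given as a list of rows.

    Assumption: this mean plays the role of the component threshold.
    """
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def static_thresholds(red, green, blue, n):
    """Return {component: (lower, upper)} threshold intervals for degree n."""
    t = {
        "R": channel_mean(red),
        "G": channel_mean(green),
        "B": channel_mean(blue),
    }
    # Assumption: the overall threshold is the average of the three
    # component thresholds.
    t_overall = sum(t.values()) / 3
    intervals = {}
    for name, tx in t.items():
        # The interval widens with the degree n and with the distance
        # between the component threshold and the overall threshold.
        spread = n * abs(tx - t_overall)
        intervals[name] = (tx - spread, tx + spread)
    return intervals
```

For the static variant the function would be called once per degree, with n running from 1 to 5, and each degree would be evaluated separately.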

4.2. Dynamic Thepade's Ternary BTC (DTTBTC) 

This technique differs from Static Thepade's Ternary BTC in
how the degree value n used in the threshold calculation is
obtained. The absolute ratio of the threshold of each color
component to the overall luminance threshold was calculated as
given in equation 5. Equation 5 thus yielded three values, one
per color component, and the largest of the three was used as
the degree value n, which was thereby chosen dynamically for
the threshold calculation.

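
The selection of the dynamic degree value can be sketched as below. The sketch follows the prose description only: it takes the three component thresholds as inputs, assumes that the overall threshold is their average, and returns the largest absolute ratio. The function name `dynamic_degree` and the error handling for a zero overall threshold are assumptions added for illustration.

```python
def dynamic_degree(t_red, t_green, t_blue):
    """Return the degree value n chosen from the three component thresholds."""
    # Assumption: the overall threshold is the average of the three
    # component thresholds.
    t_overall = (t_red + t_green + t_blue) / 3
    if t_overall == 0:
        raise ValueError("overall threshold is zero; the ratio is undefined")
    # Absolute ratio of each component threshold to the overall threshold.
    ratios = [abs(t / t_overall) for t in (t_red, t_green, t_blue)]
    # The largest of the three ratios is used as the degree value.
    return max(ratios)
```

Unlike the static variant, the resulting n is generally not an integer and depends on the color balance of each image.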
4.3. Feature Extraction

In ternary BTC, a value of 'one' was assigned to a pixel
position of the image map if the pixel value of the respective
color component was higher than the respective higher
threshold interval (Txhi). A pixel value lower than the
respective lower threshold interval (Txlo) gave a value of
'minus one' at the corresponding position of the image map;
otherwise the position received a value of 'zero'. The process
is shown in equation 6.



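$$\text{map value at } (i,j) = \begin{cases} 1 & \text{if } x(i,j) > T_{xhi} \\ 0 & \text{if } T_{xlo} \le x(i,j) \le T_{xhi} \\ -1 & \text{if } x(i,j) < T_{xlo} \end{cases} \tag{6}$$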

The means of the values in the three clusters thus formed
were taken as the feature vectors. Each color component
therefore gave three feature vectors, and in total nine
feature vectors were generated from the three color components
of each image in the dataset, as in equations 7-9.

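
The ternary mapping and the resulting nine features can be sketched in Python as follows. This is an illustrative reading of the description above, not the authors' code: each feature is taken to be the mean pixel value of one cluster, an empty cluster is assumed to contribute 0.0, and the names `ternary_features` and `image_features` are hypothetical.

```python
def ternary_features(channel, t_lo, t_hi):
    """Return the mean pixel value of the +1, 0 and -1 clusters of a channel.

    `channel` is a list of rows of pixel values; `t_lo` and `t_hi` are
    the lower and upper threshold intervals of that channel.
    """
    clusters = {1: [], 0: [], -1: []}
    for row in channel:
        for pixel in row:
            if pixel > t_hi:
                clusters[1].append(pixel)    # map value 'one'
            elif pixel < t_lo:
                clusters[-1].append(pixel)   # map value 'minus one'
            else:
                clusters[0].append(pixel)    # map value 'zero'
    # Assumption: an empty cluster contributes 0.0 so that every image
    # yields a feature vector of the same length.
    ordered = (clusters[1], clusters[0], clusters[-1])
    return [sum(c) / len(c) if c else 0.0 for c in ordered]


def image_features(red, green, blue, thresholds):
    """Concatenate the cluster means of R, G and B into nine values.

    `thresholds` maps "R", "G" and "B" to (t_lo, t_hi) pairs.
    """
    features = []
    for name, channel in (("R", red), ("G", green), ("B", blue)):
        t_lo, t_hi = thresholds[name]
        features.extend(ternary_features(channel, t_lo, t_hi))
    return features
```

Each image is thus reduced to nine numbers regardless of its size, which keeps the input to the classifier small.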
5. Experimental Evaluation



The proposed feature extraction techniques were implemented
in Matlab 7.11.0 (R2010b) on an Intel Core i5 processor with
4 GB of RAM. The classification results were evaluated with a
neural network classifier, namely the multilayer perceptron,
as shown in Fig. 2.




Fig. 2 Multilayer Perceptron 

Fig. 2 shows a perceptron having three inputs, including a
bias input, with weights of 3, 2 and -6 respectively [22]. The
activation function f4 is applied to the value
S = 3x1 + 2x2 - 6. With a unipolar step activation function,
f4 takes the values shown in equation 10.

$$f_4(S) = \begin{cases} 1 & \text{if } S > 0 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
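
The single perceptron described for Fig. 2 can be written out as a short Python sketch. It uses the weights 3, 2 and -6 stated in the text, with the bias input fixed at 1; the function names are hypothetical, and the sketch shows one unit only, not the full multilayer network used for classification.

```python
def unipolar_step(s):
    """Unipolar step activation: 1 if s > 0, otherwise 0."""
    return 1 if s > 0 else 0


def perceptron(x1, x2, weights=(3, 2, -6)):
    """Evaluate a perceptron with two data inputs and a bias input.

    The default weights are those stated for Fig. 2; the third weight
    multiplies a bias input fixed at 1.
    """
    w1, w2, w_bias = weights
    s = w1 * x1 + w2 * x2 + w_bias * 1
    return unipolar_step(s)
```

For example, the inputs (1, 1) give S = -1 and an output of 0, while (2, 1) give S = 2 and an output of 1.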



The classification process followed 10-fold cross
validation. The entire dataset was divided into 10 subsets, of
which 9 were used for training and 1 for testing, and the
process was repeated for 10 trials so that each subset served
once as the testing set. The final classification result was
obtained by averaging the 10 outputs of the 10 iterations of
cross validation [25].
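
The cross validation procedure can be sketched as follows. The sketch is a generic illustration, not the Matlab routine used in the experiments: it assigns samples to folds in an interleaved fashion without shuffling, which is an assumption, and the names `k_fold_indices`, `cross_validate` and the `evaluate` callback are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Assumption: interleaved assignment gives k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by evaluate(train, test) over k folds.

    `evaluate` is a caller-supplied function that trains on the `train`
    indices, tests on the `test` indices and returns a numeric score.
    """
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

With 1000 samples and k = 10, every trial trains on 900 images and tests on the remaining 100.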

A widely used public dataset, the Wang dataset, was used for
evaluation; a sample is shown in Fig. 3. The dataset comprises
10 categories with 100 images in each category, so 1000 images
in total were used for the assessment.





Fig. 3 Sample of Wang Dataset

Precision and Recall were used as the evaluation metrics, as
defined in equations 11 and 12. Precision measured the accuracy
of the classification process and recall measured its
completeness.



$$\text{Precision} = \frac{\text{No. of relevant images retrieved}}{\text{Total no. of images retrieved}} \tag{11}$$

$$\text{Recall} = \frac{\text{No. of relevant images retrieved}}{\text{Total no. of relevant images in the database}} \tag{12}$$
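
For a classifier, the two metrics can be computed per category from the true and predicted labels, as in the sketch below. It treats the images assigned to a category as "retrieved" and the images that truly belong to it as "relevant", which is an assumed mapping of the retrieval terminology onto classification; the function name `precision_recall` is hypothetical.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Return (precision, recall) for one category `target`."""
    # Images the classifier assigned to the target category.
    retrieved = sum(1 for p in predicted_labels if p == target)
    # Images that truly belong to the target category.
    relevant = sum(1 for t in true_labels if t == target)
    # Images of the target category that were also assigned to it.
    hits = sum(
        1 for t, p in zip(true_labels, predicted_labels) if t == target and p == target
    )
    # Assumption: a metric with an empty denominator is reported as 0.0.
    precision = hits / retrieved if retrieved else 0.0
    recall = hits / relevant if relevant else 0.0
    return precision, recall
```

Averaging the per-category values over all categories gives a single precision and a single recall figure for the whole dataset.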

6. Results and Discussion 

The proposed feature extraction techniques were first
compared with each other, as seen in Fig. 4. Dynamic Thepade's
Ternary BTC (DTTBTC) exhibited the highest precision and
recall values when compared with the different degrees of
Static Thepade's Ternary BTC (STTBTC). Among the degrees of
STTBTC, degree 1 gave the highest precision and recall values,
as shown in Fig. 4.



[Chart: "Comparison of STTBTC to DTTBTC for Classification
Performance"]

Fig. 4 Comparison of precision and recall values of
classification for STTBTC and DTTBTC

Further, the precision and recall values of classification
with Dynamic Thepade's Ternary BTC (DTTBTC) were compared
across different color spaces. The precision and recall values
for the RGB color space were the highest compared to five
other color spaces, namely LUV, YCbCr, YUV, YIQ and YCgCb. The
comparison is illustrated graphically in Fig. 5.



Fig. 5 Comparison of precision and recall values of
classification of DTTBTC with different color spaces.

Finally, feature extraction with Dynamic Thepade's Ternary
BTC (DTTBTC) was compared with state-of-the-art feature
extraction techniques in terms of the percentage precision and
recall values of classification, as shown in Fig. 6.



Fig. 6 Comparison of precision and recall values of
classification of DTTBTC with state-of-the-art techniques.

The comparison in Fig. 6 establishes the superiority of the
proposed technique over the existing techniques. The proposed
technique outperformed the existing techniques in both
precision and recall and contributed significantly to the
improvement of classification results.

7. Conclusion 

Image classification is of increasing importance due to the
growing size of visual databases with the advent of high-end
image capturing devices and social media. Unstructured image
data must be analyzed and interpreted properly to yield
important, relevant and timely information. The authors have
proposed two novel techniques for feature extraction from
visual data, which increased the classification performance.
The work can be extended as a precursor to content based image
retrieval, where classifying the data in advance can maximize
the probability of relevant results while reducing the search
space and search time.